NSF PAR Search | NSF Public Access Repository

Enabling Portable and High-Performance SmartNIC Programs with Alkali

Lin, Jiaxin; Guo, Zhiyuan; Shah, Mihir; Ji, Tao; Zhang, Yiying; Kim, Daehyeok; Akella, Aditya (April 2025, USENIX NSDI)

Free, publicly-accessible full text available April 28, 2026

Trends indicate that emerging SmartNICs, either from different vendors or generations from the same vendor, exhibit substantial differences in hardware parallelism and memory interconnects. These variations make porting programs across NICs highly complex and time-consuming, requiring programmers to significantly refactor code for performance based on each target NIC’s hardware characteristics. We argue that an ideal SmartNIC compilation framework should allow developers to write target-independent programs, with the compiler automatically managing cross-NIC porting and performance optimization. We present such a framework, Alkali, that achieves this by (1) proposing a new intermediate representation for building flexible compiler infrastructure for multiple NIC targets and (2) developing a new iterative parallelism optimization algorithm that automatically ports and parallelizes the input programs based on the target NIC’s hardware characteristics. Experiments across a wide range of NIC applications demonstrate that Alkali enables developers to easily write portable, high-performance NIC programs. Our compiler optimization passes can automatically port these programs and make them run efficiently across all targets, achieving performance within 9.8% of hand-tuned expert implementations.
LogNIC: A High-Level Performance Model for SmartNICs

https://doi.org/10.1145/3613424.3614291

Guo, Zerui; Lin, Jiaxin; Bai, Yuebin; Kim, Daehyeok; Swift, Michael; Akella, Aditya; Liu, Ming (October 2023, ACM)

Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs

https://doi.org/10.1145/3589980

Lin, Jiaxin; Ji, Tao; Hao, Xiangpeng; Cha, Hokeun; Le, Yanfang; Yu, Xiangyao; Akella, Aditya (May 2023, Proceedings of the ACM on Measurement and Analysis of Computing Systems)

The wide adoption of emerging SmartNIC technology creates new opportunities to offload application-level computation into the networking layer, which relieves the burden on host CPUs and improves performance. Shuffle, the all-to-all data exchange process, is a critical building block for network communication in distributed data-intensive applications and can potentially benefit from SmartNICs. In this paper, we develop SmartShuffle, which accelerates the shuffle process of data-intensive applications by offloading various computation tasks onto SmartNIC devices. SmartShuffle supports offloading both low-level network functions, including data partitioning and network transport, and high-level computation tasks, including filtering, aggregation, and sorting. SmartShuffle adopts a coordinated offload architecture so that sender-side and receiver-side SmartNICs jointly contribute to the benefits of shuffle computation offload. SmartShuffle carefully manages the tight and time-varying computation and memory constraints on the device. We propose a liquid offloading approach, which dynamically migrates operators between the host CPU and the SmartNIC at runtime such that resources in both devices are fully utilized. We prototype SmartShuffle on the Stingray SoC SmartNICs and plug it into Spark. Our evaluation shows that SmartShuffle improves host CPU efficiency and I/O efficiency with lower job completion time. SmartShuffle outperforms Spark and Spark RDMA by up to 40% on TPC-H.
Full Text Available